Conference Proceedings
Correlation Clustering in Data Streams
KJ Anh, G Cormode, S Guha, A McGregor, A WIRTH, F Bach (ed.), D Blei (ed.)
JMLR Workshop and Conference Proceedings | Published : 2015
Abstract
In this paper, we address the problem of correlation clustering in the dynamic data stream model. The stream consists of updates to the edge weights of a graph on n nodes and the goal is to find a node-partition such that the end-points of negative-weight edges are typically in different clusters whereas the end-points of positive-weight edges are typically in the same cluster. We present polynomial-time, O(n · polylog n)-space approximation algorithms for natural problems that arise. We first develop data structures based on linear sketches that allow the "quality" of a given node-partition to be measured. We then combine these data structures with convex programming and sampling techniques..
View full abstractGrants
Awarded by Division of Arctic Sciences
Funding Acknowledgements
K. J. Ahn: The author is currently at Google, kookjin@google.com. G. Cormode: Supported in part by European Research Council grant ERC-2014-CoG 647557, a Royal Society Wolfson Research Merit Award and the Yahoo Faculty Research Engagement Program. S. Guha: supported by NSF Award CCF-1546141. A. McGregor: Supported by NSF Award CCF-1637536, CCF-1908849, and CCF-1934846. A. Wirth: Supported in part by ARC Future Fellowship FT120100307.